Substructure Discovery Using Minimum Description Length and Background Knowledge
Authors
Abstract
The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our Subdue substructure discovery system based on the minimum description length principle. The Subdue system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of Subdue produce a hierarchical description of the structural regularities in the data. Subdue uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by Subdue to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate Subdue's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain.
Similar resources
Substructure Discovery Using Minimum Description Length Principle and Background Knowledge
Discovering conceptually interesting and repetitive substructures in structural data improves the ability to interpret and compress the data. The substructures are evaluated by their ability to describe and compress the original data set using the domain’s background knowledge and the minimum description length (MDL) of the data. Once discovered, the substructure concept is used to simplify t...
Substructure Discovery in the SUBDUE System
Because many databases contain or can be embellished with structural information, a method for identifying interesting and repetitive substructures is an essential component to discovering knowledge in such databases. This paper describes the Subdue system, which uses the minimum description length (MDL) principle to discover substructures that compress the database and represent structural co...
Substructure Discovery in the SUBDUE System
Because many databases contain or can be embellished with structural information, a method for identifying interesting and repetitive substructures is an essential component to discovering knowledge in such databases. This paper describes the SUBDUE system, which uses the minimum description length (MDL) principle to discover substructures that compress the database and represent structural con...
Structural Knowledge Discovery in Chemical and Spatio-Temporal Databases
Most current knowledge discovery systems use only attribute-value information. But relational information between objects is also important to the knowledge hidden in today’s databases. Two such domains are chemical structures and domains where objects are related in space and time. Inductive Logic Programming (ILP) discovery systems handle relational data, but require data to be expressed as a...
Graph Based Concept Learning
Concept Learning is a Machine Learning technique in which the learning process is driven by providing positive and negative examples to the learner. From those examples, the learner builds a hypothesis (concept) that describes the positive examples and excludes the negative examples. Inductive Logic Programming (ILP) systems have successfully been used as concept learners. Examples of those are...
Journal title:
- J. Artif. Intell. Res.
Volume 1, issue
Pages -
Publication date 1994